Bayesian optimistic Kullback–Leibler exploration
Similar resources
Bayesian Multi-Scale Optimistic Optimization
where $\sigma_T(x) = \kappa(x,x) - k_{1:T}(x)^\top K^{-1} k_{1:T}(x)$ and this bound is tight. Moreover, $\sigma_T(x)$ is the posterior predictive variance of a Gaussian process with the same kernel. Lemma 3 (Adapted from Proposition 1 of de Freitas et al. (2012)). Let $\kappa : \mathbb{R}^D \times \mathbb{R}^D \to \mathbb{R}$ be a kernel that is twice differentiable along the diagonal $\{(x,x) \mid x \in \mathbb{R}^D\}$, with $L$ defined as in Lemma 1.1, and $f$ be an element of the RKHS wit...
Optimistic Simulated Exploration as an Incentive for Real Exploration
Many reinforcement learning exploration techniques are overly optimistic and try to explore every state. Such exploration is impossible in environments with an unlimited number of states. I propose to use simulated exploration with an optimistic model to discover promising paths for real exploration. This reduces the need for real exploration.
Optimistic Bayesian Sampling in Contextual-Bandit Problems
In sequential decision problems in an unknown environment, the decision maker often faces a dilemma over whether to explore to discover more about the environment, or to exploit current knowledge. We address the exploration-exploitation dilemma in a general setting encompassing both standard and contextualised bandit problems. The contextual bandit problem has recently resurfaced in attempts to...
RNA-Seq Bayesian Network Exploration of Immune System in Bovine
Background: Stress is one of the main factors affecting production systems. Several factors (both genetic and environmental) regulate the immune response to stress. Objectives: To determine the major immune system regulatory genes underlying stress responses, a Bayesian network learning approach for those regulatory genes was applied to RNA-...
Model based Bayesian Exploration
Reinforcement learning systems are often concerned with balancing exploration of untested actions against exploitation of actions that are known to be good. The benefit of exploration can be estimated using the classical notion of Value of Information: the expected improvement in future decision quality arising from the information acquired by exploration. Estimating this quantity requires an a...
Journal
Journal title: Machine Learning
Year: 2018
ISSN: 0885-6125,1573-0565
DOI: 10.1007/s10994-018-5767-4